
Let’s check out Self-Efficacy models.

Major

Let’s see how a student’s major influences the average of the Self-Efficacy questions:

## 
## Call:
## glm(formula = self_efficacy_ave ~ major, data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.2592  -0.3766   0.2234   0.7408   1.2375  
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        4.25915    0.11797  36.104  < 2e-16 ***
## majorChemistry    -0.31630    0.39379  -0.803 0.422482    
## majorComputer_sci  0.14085    0.39379   0.358 0.720844    
## majorEarth_sci     0.74085    1.00100   0.740 0.459809    
## majorEngineering   0.02085    0.28247   0.074 0.941222    
## majorHealth_sci   -0.48253    0.14259  -3.384 0.000808 ***
## majorMathematics  -0.15915    0.71271  -0.223 0.823446    
## majorNon_STEM     -0.34611    0.23849  -1.451 0.147739    
## majorOther        -0.49665    0.21165  -2.347 0.019588 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.9880846)
## 
##     Null deviance: 316.14  on 311  degrees of freedom
## Residual deviance: 299.39  on 303  degrees of freedom
## AIC: 892.55
## 
## Number of Fisher Scoring iterations: 2


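This shows that students in Health Science (estimate -0.48, p < 0.001) and Other majors (estimate -0.50, p = 0.02) report significantly lower Self-Efficacy than the reference major, the baseline category whose mean is the intercept (4.26). The remaining majors do not differ significantly from that baseline.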
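Each coefficient in a one-factor model like this is a difference from the first factor level, because R uses treatment contrasts by default. A minimal sketch on simulated data may make that concrete. The data frame `df_demo`, the level names (including `Biology` as the baseline), and the group means are all hypothetical; the real `df` is not reproduced here.

```r
# Simulated stand-in for the survey data (hypothetical values throughout).
set.seed(1)
df_demo <- data.frame(
  major = factor(rep(c("Biology", "Health_sci", "Other"), times = c(40, 40, 20)))
)
true_means <- c(Biology = 4.3, Health_sci = 3.8, Other = 3.8)
df_demo$self_efficacy_ave <- unname(true_means[as.character(df_demo$major)]) +
  rnorm(nrow(df_demo), sd = 0.5)

# Same model form as in the report: Gaussian glm with one categorical predictor.
fit_demo <- glm(self_efficacy_ave ~ major, data = df_demo)

# The first factor level is the reference category.
ref_level <- levels(df_demo$major)[1]

# Observed group means, to compare with the coefficients.
group_means <- sapply(split(df_demo$self_efficacy_ave, df_demo$major), mean)

# The intercept equals the mean of the reference group.
intercept <- coef(fit_demo)[["(Intercept)"]]
print(c(intercept = intercept, reference_mean = group_means[[ref_level]]))

# Every other coefficient equals that group's mean minus the reference mean.
health_coef <- coef(fit_demo)[["majorHealth_sci"]]
health_diff <- group_means[["Health_sci"]] - group_means[[ref_level]]
print(c(coefficient = health_coef, mean_difference = health_diff))
```

Running `levels(df$major)[1]` on the real data would show which major the intercept represents.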

This is great, but it is difficult to understand this in context until we compare it to the other models.

Career

Now, let’s look at how career goals affect the average of Self-Efficacy questions:

## 
## Call:
## glm(formula = self_efficacy_ave ~ career, data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.1028  -0.3426   0.2500   0.6972   1.2574  
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            4.1022222  0.1491172  27.510   <2e-16 ***
## careereducator        -0.2868376  0.3149707  -0.911   0.3632    
## careerengineer         0.0844444  0.2982344   0.283   0.7773    
## careerhealth_care_pro -0.3595993  0.1744641  -2.061   0.0401 *  
## careermedical_doctor   0.0005556  0.1900879   0.003   0.9977    
## careernon_stem        -0.3522222  0.2688249  -1.310   0.1911    
## careerresearcher       0.4120635  0.4064250   1.014   0.3115    
## careerscientist       -0.0355556  0.3249934  -0.109   0.9130    
## careertechnician       0.1977778  0.4347476   0.455   0.6495    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 1.000617)
## 
##     Null deviance: 316.14  on 311  degrees of freedom
## Residual deviance: 303.19  on 303  degrees of freedom
## AIC: 896.48
## 
## Number of Fisher Scoring iterations: 2


This shows that those pursuing Healthcare Professional careers report lower Self-Efficacy than the reference career category (estimate -0.36, p = 0.04). No other career goal differs significantly from that baseline.

Now let’s look at Self-Efficacy through another model.

Ethnicity

How about we look at how different ethnicities affect Self-Efficacy?

## 
## Call:
## glm(formula = self_efficacy_ave ~ ethnicity, data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.0077  -0.4077   0.1556   0.7923   2.6000  
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 2.400e+00  4.372e-01   5.489 8.56e-08 ***
## ethnicityamerican_indian    1.400e+00  6.558e-01   2.135 0.033589 *  
## ethnicityarab               2.600e+00  1.071e+00   2.428 0.015778 *  
## ethnicityasian              1.644e+00  5.453e-01   3.016 0.002782 ** 
## ethnicitybiracial           1.844e+00  5.453e-01   3.382 0.000813 ***
## ethnicitylatinx             1.318e+00  4.974e-01   2.649 0.008493 ** 
## ethnicityother             -2.621e-14  1.071e+00   0.000 1.000000    
## ethnicitypacific_islander   1.333e-01  7.140e-01   0.187 0.851983    
## ethnicityprefer_not_answer  6.500e-01  6.558e-01   0.991 0.322422    
## ethnicitywhite              1.608e+00  4.414e-01   3.642 0.000318 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.9557959)
## 
##     Null deviance: 316.14  on 311  degrees of freedom
## Residual deviance: 288.65  on 302  degrees of freedom
## AIC: 883.15
## 
## Number of Fisher Scoring iterations: 2


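This shows that American Indian, Arab, Asian, Biracial, Latinx, and White respondents report significantly higher Self-Efficacy than the reference ethnicity group, whose estimated mean (the intercept) is only 2.40. That intercept has a large standard error (0.44), which implies the reference group is small, roughly five respondents, so these contrasts should be read with caution.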
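When the baseline group is small, every contrast inherits its uncertainty. One way to check this is to tabulate group sizes and, if needed, re-fit with a larger group as the reference using `relevel()`. The sketch below uses simulated data with placeholder level names (`group_a`, `group_b`, `group_c`); nothing in it comes from the real survey.

```r
# Hypothetical data: a tiny first-level group and two larger ones.
set.seed(2)
eth <- factor(rep(c("group_a", "group_b", "group_c"), times = c(5, 60, 120)))
df_demo <- data.frame(
  ethnicity = eth,
  self_efficacy_ave = rnorm(length(eth), mean = 4, sd = 1)
)

# Group sizes show how much data sits behind each coefficient.
print(table(df_demo$ethnicity))

# Default fit: the first level (only 5 rows) is the baseline.
fit_small_ref <- glm(self_efficacy_ave ~ ethnicity, data = df_demo)

# Re-fit with the largest group as the baseline.
df_demo$ethnicity <- relevel(df_demo$ethnicity, ref = "group_c")
fit_large_ref <- glm(self_efficacy_ave ~ ethnicity, data = df_demo)

se_small <- summary(fit_small_ref)$coefficients[, "Std. Error"]
se_large <- summary(fit_large_ref)$coefficients[, "Std. Error"]

# The intercept is estimated far more precisely with the larger baseline.
print(c(small_ref = se_small[["(Intercept)"]], large_ref = se_large[["(Intercept)"]]))

# The overall fit is identical; only the comparisons being reported change.
print(c(deviance(fit_small_ref), deviance(fit_large_ref)))
```

Changing the reference level does not change the model's fit, AIC, or R^2. It only changes which group the other groups are compared against.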

One last model. Let’s check it out.

Medical Conditions

Here is the model based on whether or not a respondent reports a medical condition.

## 
## Call:
## glm(formula = self_efficacy_ave ~ med_condition, data = df)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.93846  -0.33846   0.06154   0.82564   1.06154  
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       3.93846    0.06112  64.443   <2e-16 ***
## med_conditionYes  0.03590    0.17286   0.208    0.836    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 1.019679)
## 
##     Null deviance: 316.14  on 311  degrees of freedom
## Residual deviance: 316.10  on 310  degrees of freedom
## AIC: 895.49
## 
## Number of Fisher Scoring iterations: 2


This shows no evidence that having a medical condition is related to Self-Efficacy: the estimated difference is 0.04 (p = 0.836).

Conclusion

Now that we have four models, one for each demographic we are interested in, let’s compare them against each other to see which does best at explaining Self-Efficacy.

## Parameter                     |        se_mod_1 |        se_mod_2 |         se_mod_3 |       se_mod_4
## -----------------------------------------------------------------------------------------------------
## (Intercept)                   |  4.26*** (0.12) |  4.10*** (0.15) |   2.40*** (0.44) | 3.94*** (0.06)
## major (Computer sci)          |     0.14 (0.39) |                 |                  |               
## major (Earth sci)             |     0.74 (1.00) |                 |                  |               
## major (Engineering)           |     0.02 (0.28) |                 |                  |               
## major (Chemistry)             |    -0.32 (0.39) |                 |                  |               
## major (Mathematics)           |    -0.16 (0.71) |                 |                  |               
## major (Non STEM)              |    -0.35 (0.24) |                 |                  |               
## major (Other)                 |   -0.50* (0.21) |                 |                  |               
## major (Health sci)            | -0.48*** (0.14) |                 |                  |               
## career (engineer)             |                 |     0.08 (0.30) |                  |               
## career (health care pro)      |                 |   -0.36* (0.17) |                  |               
## career (medical doctor)       |                 | 5.56e-04 (0.19) |                  |               
## career (educator)             |                 |    -0.29 (0.31) |                  |               
## career (researcher)           |                 |     0.41 (0.41) |                  |               
## career (scientist)            |                 |    -0.04 (0.32) |                  |               
## career (technician)           |                 |     0.20 (0.43) |                  |               
## career (non stem)             |                 |    -0.35 (0.27) |                  |               
## ethnicity (asian)             |                 |                 |    1.64** (0.55) |               
## ethnicity (biracial)          |                 |                 |   1.84*** (0.55) |               
## ethnicity (american indian)   |                 |                 |     1.40* (0.66) |               
## ethnicity (arab)              |                 |                 |     2.60* (1.07) |               
## ethnicity (pacific islander)  |                 |                 |      0.13 (0.71) |               
## ethnicity (prefer not answer) |                 |                 |      0.65 (0.66) |               
## ethnicity (latinx)            |                 |                 |    1.32** (0.50) |               
## ethnicity (other)             |                 |                 | -2.62e-14 (1.07) |               
## ethnicity (white)             |                 |                 |   1.61*** (0.44) |               
## med condition (Yes)           |                 |                 |                  |    0.04 (0.17)
## -----------------------------------------------------------------------------------------------------
## Observations                  |             312 |             312 |              312 |            312


This table collects every coefficient we have estimated so far. Now let’s add some model performance statistics to it.

## # Comparison of Model Performance Indices
## 
## Name     | Model |     AIC | AIC weights |     BIC | BIC weights |        R2 |  RMSE | Sigma
## --------------------------------------------------------------------------------------------
## se_mod_1 |   glm | 892.545 |       0.009 | 929.975 |     < 0.001 |     0.053 | 0.980 | 0.994
## se_mod_2 |   glm | 896.478 |       0.001 | 933.908 |     < 0.001 |     0.041 | 0.986 | 1.000
## se_mod_3 |   glm | 883.148 |       0.988 | 924.321 |     < 0.001 |     0.087 | 0.962 | 0.978
## se_mod_4 |   glm | 895.491 |       0.002 | 906.720 |       1.000 | 1.391e-04 | 1.007 | 1.010


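To put all of this in English, we are most interested in models that have higher R^2 values and lower BIC, AIC, and RMSE values. By those criteria, the ethnicity model (se_mod_3) does best on AIC, R^2, and RMSE. The medical-condition model (se_mod_4) has the lowest BIC, which penalizes extra parameters heavily, but its R^2 is essentially zero. Every R^2 is below 0.09, so none of these demographics explains much of the variation on its own. Let’s put all four predictors in one model to see how they relate.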
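Several of the indices in the comparison table can be recomputed by hand from the deviances and AIC values already reported above, which helps show what they measure. The sketch below assumes the usual definitions for a Gaussian model: R^2 as the share of null deviance removed, RMSE as the root of the mean squared residual, and Akaike weights as normalized `exp(-delta / 2)`. The variable names are illustrative.

```r
# Values copied from the model summaries and the comparison table above.
n <- 312
null_dev <- 316.14
resid_dev <- c(se_mod_1 = 299.39, se_mod_2 = 303.19,
               se_mod_3 = 288.65, se_mod_4 = 316.10)
aic <- c(se_mod_1 = 892.545, se_mod_2 = 896.478,
         se_mod_3 = 883.148, se_mod_4 = 895.491)

# R^2: share of the null deviance that the model removes.
r2 <- 1 - resid_dev / null_dev

# RMSE: root of the mean squared residual.
rmse <- sqrt(resid_dev / n)

# Akaike weights: relative support for each model within this set of four.
delta <- aic - min(aic)
aic_weights <- exp(-delta / 2) / sum(exp(-delta / 2))

print(round(cbind(r2, rmse, aic_weights), 3))
```

The weights are relative to the models being compared, so a weight of 0.988 for se_mod_3 means it is the best supported of these four, not that it fits well in absolute terms.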

## 
## Call:
## glm(formula = self_efficacy_ave ~ major + ethnicity + career + 
##     med_condition, data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -3.1749  -0.3749   0.2055   0.6182   2.7232  
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 2.917731   0.479901   6.080 3.85e-09 ***
## majorChemistry             -0.502638   0.408743  -1.230 0.219818    
## majorComputer_sci           0.200983   0.478514   0.420 0.674791    
## majorEarth_sci              0.532371   1.055595   0.504 0.614417    
## majorEngineering            0.375505   0.577113   0.651 0.515789    
## majorHealth_sci            -0.417848   0.159684  -2.617 0.009352 ** 
## majorMathematics           -0.182061   0.698871  -0.261 0.794661    
## majorNon_STEM               0.062449   0.408616   0.153 0.878640    
## majorOther                 -0.405238   0.234018  -1.732 0.084418 .  
## ethnicityamerican_indian    1.471611   0.658626   2.234 0.026234 *  
## ethnicityarab               2.119737   1.072151   1.977 0.048995 *  
## ethnicityasian              1.632796   0.553118   2.952 0.003420 ** 
## ethnicitybiracial           1.642634   0.556800   2.950 0.003440 ** 
## ethnicitylatinx             1.209543   0.498482   2.426 0.015868 *  
## ethnicityother             -0.480263   1.072151  -0.448 0.654534    
## ethnicitypacific_islander   0.160269   0.713446   0.225 0.822420    
## ethnicityprefer_not_answer  0.445674   0.655239   0.680 0.496949    
## ethnicitywhite              1.494614   0.442234   3.380 0.000827 ***
## careereducator             -0.255685   0.338821  -0.755 0.451094    
## careerengineer             -0.588764   0.585327  -1.006 0.315330    
## careerhealth_care_pro      -0.223101   0.175434  -1.272 0.204512    
## careermedical_doctor       -0.037468   0.193706  -0.193 0.846764    
## careernon_stem             -0.699786   0.431068  -1.623 0.105614    
## careerresearcher            0.055284   0.446979   0.124 0.901653    
## careerscientist            -0.266564   0.333683  -0.799 0.425040    
## careertechnician           -0.003913   0.480447  -0.008 0.993508    
## med_conditionYes            0.103215   0.174851   0.590 0.555456    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for gaussian family taken to be 0.939999)
## 
##     Null deviance: 316.14  on 311  degrees of freedom
## Residual deviance: 267.90  on 285  degrees of freedom
## AIC: 893.87
## 
## Number of Fisher Scoring iterations: 2

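In the combined model, Health Science majors and the same six ethnicity contrasts remain significant, while the Healthcare Professional career effect does not (p = 0.20) and the Other major effect becomes marginal (p = 0.08). Its AIC (893.87) is also higher than that of the ethnicity-only model (883.15), so the extra predictors do not pay for their added complexity. This gives us a good starting point for further data analysis in our project.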
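A combined model reports one test per factor level, which makes it hard to judge a predictor as a whole. One possible follow-up, not part of the analysis above, is `drop1()` with an F-test, which asks whether removing an entire term worsens the fit. The sketch uses simulated data with hypothetical level names and an outcome in which only `major` has a real effect.

```r
# Simulated stand-in data (hypothetical levels and effect sizes).
set.seed(3)
n <- 200
df_demo <- data.frame(
  major = factor(sample(c("major_a", "major_b", "major_c"), n, replace = TRUE)),
  ethnicity = factor(sample(c("group_a", "group_b", "group_c"), n, replace = TRUE)),
  med_condition = factor(sample(c("No", "Yes"), n, replace = TRUE))
)
df_demo$self_efficacy_ave <- 4 - 0.8 * (df_demo$major == "major_b") +
  rnorm(n, sd = 0.7)

# Combined Gaussian model, analogous in form to the one in the report.
fit_full <- glm(self_efficacy_ave ~ major + ethnicity + med_condition,
                data = df_demo)

# One F-test per predictor: does dropping the whole term worsen the fit?
term_tests <- drop1(fit_full, test = "F")
print(term_tests)
```

In this simulated example the row for `major` should show a small p-value, because that is the only effect built into the data.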
